Parsing with discontinuous phrases

Author

  • Allan Ramsay
Abstract

Most parsing algorithms require phrases that are to be combined to be either contiguous or marked as being ‘extraposed’. The assumption that phrases which are to be combined will be adjacent to one another supports rapid indexing mechanisms: the fact that in most languages items can turn up in unexpected locations cancels out much of the ensuing efficiency. The current paper shows how ‘out of position’ items can be incorporated directly. This leads to efficient parsing even when items turn up having been right-shifted, a state of affairs which makes Johnson and Kay [1994]’s notion of ‘sponsorship’ of empty nodes inapplicable.
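The contrast the abstract draws between adjacency-based combination and 'out of position' items can be illustrated with a small sketch. This is not the algorithm from the paper, whose details are not given here. It only assumes one common way to represent discontinuous phrases: each chart edge records the set of word positions it covers as a bitmask, and two edges may combine whenever their covers are disjoint. The names Edge, span_mask, is_contiguous and combine, and the example sentence, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Edge:
    """A chart edge (hypothetical structure, not taken from the paper)."""
    label: str
    cover: int  # bitmask: bit i is set when word i belongs to the phrase


def span_mask(*positions: int) -> int:
    """Build a cover bitmask from individual word positions."""
    mask = 0
    for p in positions:
        mask |= 1 << p
    return mask


def is_contiguous(mask: int) -> bool:
    """True when the set bits of mask form one unbroken run."""
    if mask == 0:
        return False
    # Shift out trailing zeros, then check the result has the form 2**k - 1.
    shifted = mask >> ((mask & -mask).bit_length() - 1)
    return shifted & (shifted + 1) == 0


def combine(label: str, left: Edge, right: Edge) -> Optional[Edge]:
    """Combine two edges if they share no words; adjacency is not required."""
    if left.cover & right.cover:
        return None  # overlapping covers cannot form one phrase
    return Edge(label, left.cover | right.cover)


# Hypothetical example: "She looked the number up" (positions 0..4).
verb = Edge("V", span_mask(1))        # looked
particle = Edge("Prt", span_mask(4))  # up
np = Edge("NP", span_mask(2, 3))      # the number

verb_prt = combine("V+Prt", verb, particle)  # discontinuous: words 1 and 4
vp = combine("VP", verb_prt, np)             # fills the gap: words 1..4
```

A parser that insisted on adjacency would reject the V+Prt edge because its cover has a gap. Here the gap is tolerated until the object noun phrase fills it. The cost is that edges can no longer be indexed by start and end position alone, which is the efficiency trade-off the abstract refers to.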

Similar resources

Discontinuous Verb Phrases in Parsing and Machine Translation of English and German

In this paper, we focus on the verb-particle (V-Prt) split construction in English and German and its difficulty for parsing and Machine Translation (MT). For German, we use an existing test suite of V-Prt split constructions, while for English, we build a new and comparable test suite from raw data. These two data sets are then used to perform an analysis of errors in dependency parsing, word-...

Synchronous Rewriting in Treebanks

Several formalisms have been proposed for modeling trees with discontinuous phrases. Some of these formalisms allow for synchronous rewriting. However, it is unclear whether synchronous rewriting is a necessary feature. This is an important question, since synchronous rewriting greatly increases parsing complexity. We present a characterization of recursive synchronous rewriting in constituent ...

Optimal Rank Reduction for Linear Context-Free Rewriting Systems with Fan-Out Two

Linear Context-Free Rewriting Systems (LCFRSs) are a grammar formalism capable of modeling discontinuous phrases. Many parsing applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) does not exceed 2. We present an efficient algorithm for optimal reduction of the length of production right-hand side in LCFRSs with fan-out at most 2. This results in asymptotical ru...

A Synchronous Context Free Grammar using Dependency Sequence for Syntax-based Statistical Machine Translation

We introduce a novel translation rule that captures discontinuous, partial constituent, and non-projective phrases from source language. Using the traversal order sequences of the dependency tree, our proposed method 1) extracts the synchronous rules in linear time and 2) combines them efficiently using the CYK chart parsing algorithm. We analytically show the effectiveness of this translation ...

An Optimal-Time Binarization Algorithm for Linear Context-Free Rewriting Systems with Fan-Out Two

Linear context-free rewriting systems (LCFRSs) are grammar formalisms with the capability of modeling discontinuous constituents. Many applications use LCFRSs where the fan-out (a measure of the discontinuity of phrases) is not allowed to be greater than 2. We present an efficient algorithm for transforming LCFRS with fan-out at most 2 into a binary form, whenever this is possible. This results...

Journal:
  • Natural Language Engineering

Volume 5, Issue 

Pages  -

Publication date: 1999